Summary

Economists and accountants have followed the lead of other scientists
in resorting to ordinary least squares (OLS) regression for the estimation
of costs. In so doing they have not given adequate consideration to the
loss function associated with errors in OLS cost estimates. A different
approach to cost estimation with a different loss function, namely the
minimization of absolute deviations (MAD), is suggested. The technique for
MAD estimation is shown to be a simplified version of an L.P. goal
programming model. A brief discussion compares the estimation statistics
of the OLS and MAD approaches.


MAD  Cost  Estimation 

Statistical cost estimation techniques have been part of the managerial
accountant's tool kit for the last two decades. The main use of these
tools has been in cost recognition. Management accountants derive
estimated cost relationships in order to establish standard costs and to
provide budget officers with predictive models. Either because the
applications of statistical tools have been straightforward or because
much of the data utilized is of a proprietary, in-house nature,
relatively few of these studies have appeared in the accounting
literature. In those studies which have been published, the only
technique applied and reported is that of ordinary least squares (OLS)
regression analysis [e.g., McClenon, 1963; Benston, 1966; Comiskey, 1966].

Before we suggest an alternative to OLS regression analysis, it may
be useful to recall a few facts about the origins of OLS. The method
is attributed either to Legendre or to Gauss, who disputed with each
other its first use back in the 18th century (note 2). Gauss applied OLS
to interpret astronomical data and defended the method because its linear
estimator had the property of unbiasedness and minimized the variance
of the estimators (Gauss-Markov Theorem). Throughout the 19th century,
in genetics, biology, agriculture, etc., scientists resorted to OLS
to infer the true nature of their sciences.


In 1960 J. Johnston reported on empirical econometric studies done for
a number of industrial and financial sectors: Statistical Cost Analysis,
McGraw-Hill.

2. Plackett, R. L., "The Discovery of the Method of Least Squares,"
Biometrika, 1972, 59, pp. 239-51.
The general approach of these men, as they relied on classical
statistical methods in general and on OLS in particular, has been
characterized as that of inference. The inference school, led by R. A.
Fisher, considered statistics as a means of processing data into
scientific relationships so that, by use of observations and
experimentation, uncertainty about these relationships might be reduced
in an unbiased manner (note 3). Inference thus dealt simply with
information and did not allow itself to be influenced by the implications
of the conclusions reached by the observer. For example, a statistician
helping a biologist examine the bacterial density of a reservoir could
not, qua statistician, permit his concern about the consequences of
certain infestation levels to affect how his calculations would be made.

However, in the last thirty years a new approach, that of decision
theory, arose to challenge inference. Ferguson [1976] has summarized
the four basic elements of this decision model:

1)  The space \(\Theta\) of states of nature, one of them "true" but
unknown to the statistician.

2)  The space \(A\) of actions available to the decision maker.

3)  The loss function \(L(\theta, a)\) representing the loss for taking
action \(a \in A\) when the true state of nature is \(\theta \in \Theta\).

4)  An experiment yielding observation \(X\), the distribution of
which depends on the true state of nature, and which will
help minimize \(L(\theta, a)\) from taking action \(a\).


3. Another major reason for scientific reliance on OLS is alleged to
have been the ease of computation (no minus signs, differentiability,
etc.). Whether this explanation is valid or not, the authors in their
brief survey of the history of statistics have found very little evidence
for this claim.
It is interesting to note that points 2) and 3) required
specifications which the proponents of inference had been unwilling to
make and that statisticians henceforth did build into their methodology a
concern for consequences. Perhaps decision theory's greatest
contribution has been to define a loss function explicitly, for without a
loss function it has been impossible to suggest action to be taken
under uncertainty such as to minimize detrimental consequences (or such
as to maximize utility) among all possible states of nature.

In fact the technique of OLS, a well-established tool of inference,
implies a loss function, and one which has been described as an epistemic
loss function, i.e., a purely intellectual or cognitive one. We shall
argue that economists, and subsequently accountants, in their increasing
reliance on OLS for cost estimation, have given adequate thought neither
to the nature of this loss function nor to a decision-theoretic approach
to cost estimation, an activity which is engaged in primarily for
decision-making purposes.

This paper will examine the loss function implicit in the ordinary
least squares (OLS) estimation technique and offer an alternative
technique of estimation based on a different loss function and on a more
explicit statement of the costs of misestimation. This paper does not
assert the superiority of one method over the other. It merely attempts
to help arrive at a better understanding of the assumptions of
regression analysis and to offer a method of computation when an
alternative class of loss function seems more appropriate for a given
situation.


4. T. S. Ferguson, "Development of the Decision Model," in On the
History of Statistics and Probability, edited by D. B. Owen, pp. 335-6.
The above summary relies heavily on Ferguson's chapter.
In order to facilitate the discussion we will use as an example a
problem which was included in the December 1974 CMA examination:

The Ramon Co. manufactures a wide range of products at
several different plant locations. The Franklin Plant,
which manufactures electrical components, has been
experiencing some difficulties with fluctuating monthly overhead
costs. The fluctuations have made it difficult to estimate
the level of overhead that will be incurred for any one
month.

Management wants to be able to estimate overhead costs
accurately in order to plan its operation and financial
needs better. A trade association publication to which
Ramon Co. subscribes indicates that, for companies
manufacturing electrical components, overhead tends to vary with
direct-labor hours.

One member of the accounting staff has proposed that
the cost behavior pattern of the overhead costs be
determined. Then overhead costs could be predicted from the
budgeted direct-labor hours.

Another member of the accounting staff suggested that
a good starting place for determining the cost behavior
pattern of overhead costs would be an analysis of
historical data. The historical cost behavior pattern would
provide a basis for estimating future overhead costs. The
methods proposed for determining the cost behavior pattern
included the high-low method, the scattergraph method,
simple linear regression, and multiple regression. Of
these methods Ramon Co. decided to employ the high-low
method, the scattergraph method, and simple linear
regression. Data on direct-labor hours and the respective
overhead costs incurred were collected for the previous two
years. The raw data follow:


                      19_3                        19_4
             Direct-Labor   Overhead     Direct-Labor   Overhead
                Hours        Costs          Hours        Costs

January         20,000      $84,000         21,000      $86,000
February        25,000       99,000         24,000       93,000
March           22,000       89,500         23,000       93,000
April           23,000       90,000         22,000       87,000
May             20,000       81,500         20,000       80,000
June            19,000       75,500         18,000       76,500
July            14,000       70,500         12,000       67,500
August          10,000       64,500         13,000       71,000
September       12,000       69,000         15,000       73,500
October         17,000       75,000         17,000       72,500
November        16,000       71,500         15,000       71,000
December        19,000       78,000         18,000       75,000
Using linear regression, the following data were obtained:

Coefficient of determination                          .9109
Coefficient of correlation                            .9544
Coefficients of regression equation
    Constant                                         39,859
    Independent variable                             2.1549
Standard error of the estimate                        2,840
Standard error of the regression coefficient
    for the independent variable                      .1437
True t-statistic for a 95% confidence interval
    (22 degrees of freedom)                           2.074

The problem asks the students to construct the overhead cost
function using the results of OLS computations and to defend the
superiority of the OLS technique over less sophisticated techniques such
as high-low, visual curve fitting, etc. The cost pattern estimated in the
problem by OLS is

\[ y_i = a + b x_i + e_i = \$39{,}859 + \$2.1549\, x_i + e_i \]

where   \(y_i\), the dependent variable, is the overhead cost of month i

\(x_i\), the independent variable, is the direct-labor hours for month i

\(e_i\) is the random error of the estimate for month i

\(a\), the intercept, is the estimated "fixed" overhead for one month

\(b\), the slope, is the estimated "variable" overhead for one
hour of direct labor
In general economists, starting from the production function, posit
a cost function \(y = f(q)\) in which cost is a function of output q.
Accountants more often define cost equations in which cost is
\(y = g(x_1, x_2, \ldots, x_n)\), where the \(x\) represent inputs. The
above problem is of the latter nature, i.e., a cost equation. In either
case, because of the stochastic nature of the cost-volume relationships
under examination and/or because of technical limitations in model
building and measurement, economic or cost accounting forecasts (budgets)
do not correspond exactly to cost observations. Statistical techniques
will help in the assessment of the "true" underlying relations between
inputs and outputs and their cost. Given the randomness of the cost
relationship and measurement errors, no statistical estimates will yield
error-free cost functions.

In a decision theory context, estimates (\(\hat{y}_i\)) are predictions
(signals) of uncertain states of nature, which will be used in evaluating
alternative actions. Accordingly the estimated cost functions (or
equations) [\(y = f(q)\)] or [\(y = g(x)\)] can be viewed as information
systems. In general, perfectly accurate predictions will lead to better
decisions than will erroneous ones. But, as indicated earlier, a
perfect cost function would be unobtainable, even if it existed.
Further, the consequences of various prediction errors are dependent
upon the decision problem at hand and cannot be generalized. Since
cost functions are estimated to facilitate decision making, the
criterion for the parameter values of the cost function should be
compatible with the decision objective, stated as a loss function.

We can formulate the cost function estimation problem as an
optimization problem. The objective of the problem is to select
parameter values such that the consequences of the differences between
the estimated values and the observed values are at a minimum. The manner
in which the differences are measured, weighted and accumulated will
determine the specific form of the objective function in the
optimization problem.
Thus one can argue that the structure of the objective function in
the estimation problem is directly related to the loss, or utility,
function in the decision problem. The following analysis is based on the
assumption that the selection of the objective function can be guided
by the knowledge of the loss function in the decision problem. We
suggest that the loss function related to cost estimation be linear
(instead of quadratic), and this paper will therefore offer a family
of objective functions which are linear.

Closer examination of the OLS method reveals that the following
loss function is being used in the determination of cost behavior:

(1)  The penalties, \(c(e_i)\), associated with positive or negative
errors are identical. That is, the magnitude, not the
direction of error, is the sole determinant of the penalty:

\[ c(e_i) = c(e_j) \quad \text{if and only if} \quad |e_i| = |e_j| \]

(2)  The relative penalties of the errors are the squares of the
relative magnitudes of the errors:

\[ c(e_i)/c(e_j) = \left( |e_i| / |e_j| \right)^2 \]

The above loss function means, among other things, that

(1)  Unfavorable and favorable variances of the same magnitude are
equally significant, whether in a standard costing system or for a
pricing decision, and

(2)  A $500 variance is 25 times worse than a $100 variance due to
the squaring aspect of the relative magnitude of the errors.
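
The arithmetic behind the second implication can be sketched in a few lines of Python. The function names are illustrative only; the linear penalty is included for contrast because it is the alternative form of loss suggested in this paper.

```python
def quadratic_penalty(error):
    """Penalty implied by OLS: proportional to the squared error."""
    return error ** 2


def linear_penalty(error):
    """Linear alternative: proportional to the absolute error."""
    return abs(error)


small, large = 100.0, 500.0
quadratic_ratio = quadratic_penalty(large) / quadratic_penalty(small)
linear_ratio = linear_penalty(large) / linear_penalty(small)
print(quadratic_ratio)  # 25.0: a $500 variance counts 25 times a $100 one
print(linear_ratio)     # 5.0: under a linear loss it counts only 5 times

# Under the quadratic penalty the sign of the error does not matter.
symmetric = quadratic_penalty(-large) == quadratic_penalty(large)
```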

Of course, there are many situations in which neither economists nor
accountants will be satisfied with the above implications. As a matter
of fact, there is no reason to believe that all cost estimators would
consciously choose such a quadratic loss function. The issue then is:
how can we estimate costs by a technique having a loss function other
than the quadratic one? The following section presents a computational
technique with a set of alternative loss functions that can be used.

Cost Estimation by Minimization of Absolute Deviations (MAD)

The criteria proposed for cost estimation as an alternative to OLS
are the following:

(1)'   The direction of errors, as well as the magnitude of
errors, is pertinent in determining the cost of
errors. Formally stated:

\[ c(e_i) = k\, c(e_j), \qquad e_i = -e_j \]

where k is a positive weighting factor.

(2)'   The relative penalty of the errors is proportional to
the relative magnitude of the errors:

\[ c(e_i)/c(e_j) = e_i/e_j, \qquad e_i \times e_j \geq 0 \]

A regression model which estimates coefficients of the cost equation
and which at the same time satisfies the above criteria is based on a
simplified version of the L.P. goal programming model. Assume the cost
equation takes the traditional form

\[ y_i = a' + b' x_i + e'_i \qquad \text{for observations } i = 1, 2, \ldots, 24 \]

The values of the coefficients, a' and b', must be estimated so as to
minimize the sum of the absolute values of the error terms, \(e'_i\), viz.
[Wagner, 1959],

\[ \min \sum_{i \in I} \left| a' + x_i b' - y_i \right| \]
where \(x_i\) and \(y_i\) are constants (tuples of observation values), a'
and b' are the activity levels to be developed by the algorithm, and the
vertical strokes mean the absolute value of the expression they enclose.

It is important to understand that a' and b', coefficients of the
cost equation, have now become activity variables of the L.P. model;
also that x, the independent variable, and y, the dependent variable of
the cost equation, have become constants in the objective function.

If the deviations, \(e'_i\), can be related to the observation tuples,
it is clear by definition that the objective function can be restated as

\[ \min \sum_{i \in I} \left[ e_i^{+} + e_i^{-} \right] \qquad \text{for } |e'_i| = e_i^{+} + e_i^{-}. \]

And indeed, Charnes, Cooper and Ferguson have shown the equivalence in
the linear programming format of the above objective function developed
by Wagner and of

\[ \min \sum_{i \in I} \left[ e_i^{+} + e_i^{-} \right] \]

\[ \text{s.t.} \quad a' + x_i b' - e_i^{+} + e_i^{-} = y_i \qquad \text{for } i = 1, 2, \ldots, n \]

\[ a',\ b',\ e_i^{+},\ e_i^{-} \geq 0 \]


Concerned by overemphasis on outliers, Sharpe [1971] applied the same
MAD criterion in security risk analysis. For highly diversified
portfolios, he found differences between OLS and MAD to be relatively
small and reached the tentative conclusion that gains from use of MAD
would be modest.

Charnes, A., Cooper, W. W. & Ferguson, R. O., "Optimal Estimation of
Executive Compensation by Linear Programming," Management Science 1 (1955)
pp. 138-51.
where \(e_i^{+}\) represents an overage, or deviation above the regression
line, and \(e_i^{-}\) an underage, or deviation beneath the regression line.
This formulation will be recognized as the goal programming first
developed by Charnes & Cooper, introduced to the accounting literature
by Ijiri (note 8) and made the subject of a book-length study by Sang M.
Lee [1972].

This model is a very simplified version of goal programming because
there is no problem with goal dimensionality (all deviations are measured
in $) and because, therefore, no preemptive priorities are established
for the goals (i.e., the observations to be fitted to a regression
equation) or applied to the deviations from these goals.

Referring to the sample problem, the constraint set can be further
identified as follows:

\[ a_1 - a_2 + x_i b_1 - x_i b_2 - e_i^{+} + e_i^{-} = y_i \]

or

(1)   \( a_1 - a_2 + 20{,}000\, b_1 - 20{,}000\, b_2 - e_1^{+} + e_1^{-} = 84{,}000 \)

(2)   \( a_1 - a_2 + 25{,}000\, b_1 - 25{,}000\, b_2 - e_2^{+} + e_2^{-} = 99{,}000 \)

...

(24)  \( a_1 - a_2 + 18{,}000\, b_1 - 18{,}000\, b_2 - e_{24}^{+} + e_{24}^{-} = 75{,}000 \)


Charnes, A. & Cooper, W. W., Management Models and Industrial
Applications of Linear Programming (Wiley 1961); refer also to "Goal
Programming & Multiple Objective Optimization," European Jour. of O. R.
(1977) pp. 39-54.

8. Ijiri, Y., Management Goals and Accounting for Control (Amsterdam:
North Holland 1965); refer also to L. N. Killough and T. L. Souders,
"Goal Programming for Public Accounting Firms," The Accounting Review,
April 73, pp. 268-279.
The cost equation coefficients, a' and b', which are now the decision
variables of the L.P. model, have been replaced by pairs \((a_1, a_2)\) and
\((b_1, b_2)\) because of the linear programming requirement that the values
of all variables be non-negative. Only one variable in each pair will
take on a value while the other remains at zero. If the intercept is
positive, \(a_1\) will assume that value, whereas \(a_2\) will take on the
intercept value if it is negative. Likewise \(b_1\) and \(b_2\) represent
positive and negative slopes respectively. Finally the error terms
\(e_i^{+}\) and \(e_i^{-}\) represent positive or negative deviation terms
associated with the i-th set of observations. Again only one of these
terms per equation can take on a value. Stated in more familiar notation:


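Minimize \( x_{29} + x_{30} \)                                    (Prob. 1)

s.t.  (1)   \( x_1 - x_2 + 20{,}000\, x_3 - 20{,}000\, x_4 - x_5^{+} + x_5^{-} = 84{,}000 \)
      (2)   \( x_1 - x_2 + 25{,}000\, x_3 - 25{,}000\, x_4 - x_6^{+} + x_6^{-} = 99{,}000 \)
      ...
      (24)  \( x_1 - x_2 + 18{,}000\, x_3 - 18{,}000\, x_4 - x_{28}^{+} + x_{28}^{-} = 75{,}000 \)

and

s.t.  \( - x_5^{+} - x_6^{+} - \ldots - x_{28}^{+} + x_{29} = 0 \)

and   \( - x_5^{-} - x_6^{-} - \ldots - x_{28}^{-} + x_{30} = 0 \)

all x's \(\geq 0\)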
The first 24 constraints have the form [D] x = y, where y is the
column of observed overhead costs \((y_1, y_2, \ldots, y_{24})\) and [D] is:

[ 1  -1  +obs.1   -obs.1    -1 +1                     ]
[ 1  -1  +obs.2   -obs.2          -1 +1               ]
[ .   .     .        .                  .             ]
[ 1  -1  +obs.24  -obs.24                     -1 +1   ]


The last two constraints aggregate the positive and negative errors
respectively. While not necessary, this formulation will provide a
convenient way of weighting positive and negative errors differentially.
If we choose to, we can multiply \(x_{29}\) and \(x_{30}\) by different
weights.

For our example, as shown in Table 1 and Figure 2, the values of
a' and b' are +40,170 and +2.167 respectively, which are quite similar
to the OLS estimates. In the next section we shall show that for
a symmetric distribution of error terms, this result should be expected.
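
The MAD estimates can be checked without an L.P. code by the following Python sketch. It does not run the simplex method of the formulation above; it substitutes a brute-force search that relies on a known property of this linear program, namely that an optimal solution exists at a vertex, which for a line with two free coefficients means a fitted line passing through at least two observations. The function names are illustrative, and the approach is practical only for small samples such as this one.

```python
from itertools import combinations

# Direct-labor hours and overhead costs, January-December of 19_3, then 19_4.
hours = [20000, 25000, 22000, 23000, 20000, 19000,
         14000, 10000, 12000, 17000, 16000, 19000,
         21000, 24000, 23000, 22000, 20000, 18000,
         12000, 13000, 15000, 17000, 15000, 18000]
overhead = [84000, 99000, 89500, 90000, 81500, 75500,
            70500, 64500, 69000, 75000, 71500, 78000,
            86000, 93000, 93000, 87000, 80000, 76500,
            67500, 71000, 73500, 72500, 71000, 75000]


def sum_abs_dev(a, b, x, y):
    """Sum of absolute deviations of the observations from the line a + b*x."""
    return sum(abs(yi - (a + b * xi)) for xi, yi in zip(x, y))


def mad_fit(x, y):
    """Return (intercept, slope, SAD) of the minimum-absolute-deviation line.

    Every line through two observations with distinct x values is tried,
    and the one with the smallest sum of absolute deviations is kept.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # a vertical line cannot serve as a cost equation
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        sad = sum_abs_dev(a, b, x, y)
        if best is None or sad < best[2]:
            best = (a, b, sad)
    return best


a_mad, b_mad, sad = mad_fit(hours, overhead)
print(f"intercept {a_mad:,.0f}, slope {b_mad:.3f}, SAD {sad:,.0f}")
```

The result can be compared with the values of a' and b' reported for Problem 1, allowing for rounding in the reported figures.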

In the usual goal programming problem the several goals of the
optimizer, which become part of the constraint set and which may conflict
with one another, represent operational conditions [refer S. M. Lee,
Chapters 8 to 14, which refer to production planning, marketing,
corporate planning, medical care planning, etc.]. The separate constraints
(goals) may or may not have common dimensions, and when they do not,
they can only be ranked ordinally, i.e., by preemptive priorities.
Such priorities are then assigned to the slack in the goal constraints
(deviation variables) so that deviations can be summarized in a
prescribed order. Within the particular subsets of constraints classified
under one priority, additional refinement of goal attainment can be
achieved by the assignment of different weights to deviation variables.
In our cost estimation goal program, the goals are all cost-fitting
statements of the same dimension, and typically one set of
observations is no more important than the next in contributing to the
definition of the cost relationship. For this reason (and because of
the single monetary dimension), no preemptive priorities need to be
established and the cost estimation process will be deliberately
indifferent to fitting one constraint instead of another (note 9). On the
other hand, negative deviations may be viewed more seriously than positive
ones (or vice versa) and weighted accordingly, with a corresponding effect
on the MAD loss function. The objective function would then become

\[ \min\ w x_{29} + v x_{30} \]

for w and v, subjective weightings of the positive and negative
deviations.

For example, suppose a company is considering a bid on a job. If
the cost estimate is too low (positive error), the company suffers a
reduction in profit. If the estimate is too high (negative error),
the company unnecessarily reduces its chances of obtaining the job.


9. Ignizio [1978] provides a nuclear engineering cost example, in which
certain parameter values must fall within a given range due to physical
limitations. He thus assigns preemptive priorities to the objective
of estimating these parameters over others.
Let us assume that management of the firm assesses that the penalty
or cost associated with positive errors (underestimation) is twice as
severe as that associated with negative errors (overestimation). In
such a case, the objective function for estimation purposes can be
adjusted as follows:

Minimize \( x_{29} + 2 x_{30} \)                                  (Prob. 2)

Constraints are the same as in Problem 1.

The estimated coefficient values for this problem are an intercept (a')
of $42,750 and a slope coefficient (b') of $2.062 per unit (see Table 1
and Figure 1).

On the other hand, if management wishes to use past data for the
purpose of setting standards, it may for motivational reasons wish to
set up a tight but still attainable standard. In this circumstance,
it is more plausible to consider the following objective function:

Minimize \( k x_{29} + x_{30}, \quad k > 1 \)                     (Prob. 3)

Constraints are the same as in Problem 1.

For k = 2 we obtain an estimated intercept (a') of $38,170 and a
coefficient (b') unit cost of $2.167 (see Table 1).
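
Problems 2 and 3 can be checked with the same kind of sketch. As before, this is not the simplex method: it searches the lines through pairs of observations, which is where an optimum of this linear program can be found. The parameter names `w_over` and `w_under` are illustrative; `w_over` weights deviations in which the fitted cost exceeds the observed cost (overestimation) and `w_under` weights those in which it falls short (underestimation), so Problem 2 corresponds to doubling `w_under` and Problem 3 with k = 2 to doubling `w_over`.

```python
from itertools import combinations

hours = [20000, 25000, 22000, 23000, 20000, 19000,
         14000, 10000, 12000, 17000, 16000, 19000,
         21000, 24000, 23000, 22000, 20000, 18000,
         12000, 13000, 15000, 17000, 15000, 18000]
overhead = [84000, 99000, 89500, 90000, 81500, 75500,
            70500, 64500, 69000, 75000, 71500, 78000,
            86000, 93000, 93000, 87000, 80000, 76500,
            67500, 71000, 73500, 72500, 71000, 75000]


def weighted_mad_fit(x, y, w_over, w_under):
    """Return (intercept, slope) minimizing the weighted absolute deviations."""

    def loss(a, b):
        total = 0.0
        for xi, yi in zip(x, y):
            gap = (a + b * xi) - yi  # fitted cost minus observed cost
            total += w_over * gap if gap > 0 else w_under * (-gap)
        return total

    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # skip pairs that would give a vertical line
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        value = loss(a, b)
        if best is None or value < best[2]:
            best = (a, b, value)
    return best[0], best[1]


prob2 = weighted_mad_fit(hours, overhead, w_over=1.0, w_under=2.0)
prob3 = weighted_mad_fit(hours, overhead, w_over=2.0, w_under=1.0)
print(f"Prob. 2: intercept {prob2[0]:,.0f}, slope {prob2[1]:.3f}")
print(f"Prob. 3: intercept {prob3[0]:,.0f}, slope {prob3[1]:.3f}")
```

Penalizing underestimation more heavily raises the fitted line, while penalizing overestimation lowers it, which is the direction of the shifts reported for the two problems.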

Such an L.P. goal programming formulation can easily be adjusted
to more than one independent variable (multiple regression). It can
also accommodate indicator variables to account for effects not reflected
in the values of the independent variables, as well as allowing for
piece-wise-linear equations (see Table 1). We can force the fitted
line (plane) to pass through a given desired point by any one of
several means. For example, if we believe that any given tuple
\((x_i, y_i)\) exactly represents the cost relationship and we wish the
regression to pass through that point, we can exclude the deviations for
that particular constraint from the aggregation of positive and negative
deviations and include them separately in the objective function,
appropriately weighted by M:

\[ \min\ w x'_{29} + v x'_{30} + M x_i^{+} + M x_i^{-} \]

where \(x'_{29}\) and \(x'_{30}\) are the aggregations of positive and
negative deviations excluding the i-th observation set.

In the above case, due to the weighting factor, the MAD estimates
will not be the same as the OLS estimates even if the data
were distributed such that the error terms satisfy the OLS assumptions,
i.e., independent and identical normal distributions.

Non-Symmetric  Error  Distribution 

Next, we consider the case of non-normal, non-symmetric error
distributions. The measure of central tendency that minimizes the
sum of the squared deviations is the mean or the average (OLS
estimates) [Gauss, 1821], whereas the measure that minimizes the sum of
absolute deviations is the median of the distribution (Minimum
Absolute Deviation (MAD) estimates) [Laplace, 1812]. For symmetric
distributions, the mean and the median are the same; estimates of
parameters will tend to be similar where the distribution of error
terms is symmetric and then we would expect the OLS and MAD estimates


If that point is given by the set of sample means, we can add
to the constraint set the linear restriction
\( a + b_1 \bar{x}_1 + \ldots + b_m \bar{x}_m = \bar{y} \)
(Wagner, 1959, p. 208).
to be quite similar. The roles that these central tendency
measures play have been discussed in the accounting literature by
Barefield [1969] and others [Peterson and Miller, 1964]. Our
discussion is not meant to repeat the properties of the measures, but to
demonstrate how those measures can be estimated using the MAD
estimation technique. For convenience, we will use a non-symmetric
triangular distribution of error terms as the example for analysis.
The range of the distribution is (0,3) and the peak point is 2, as shown
in Figure 3. The mode of the distribution is 2 and the mean and
median are 5/3 and 1.732 respectively (details are shown in the Appendix).


Insert  Figures  3  and  4  about  here 

Given this distribution, the estimator which minimizes the sum of
squared errors (\(\sum e_i^2\)) is the mean, 5/3; the estimator which
minimizes the sum of absolute deviations (\(\sum |e_i|\)) is the median,
1.732; and the mode can be estimated using the weighted linear loss
function, \( s|e_i^{+}| + w|e_i^{-}| \) (again see Appendix for details).
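
The three values quoted above can be recovered with a short derivation. What follows is a sketch and not a reproduction of the Appendix; the symbols \(f\), \(F\), \(c\), \(w_u\) and \(w_o\) are introduced here for illustration, with \(w_u\) the weight on errors that lie above the estimate \(c\) and \(w_o\) the weight on errors that lie below it.

```latex
% Triangular density on (0,3) with peak at 2, and its distribution function
f(e) = \begin{cases} e/3, & 0 \le e \le 2 \\ 2(3-e)/3, & 2 \le e \le 3 \end{cases}
\qquad
F(e) = e^2/6 \quad \text{for } 0 \le e \le 2

% Mean: minimizer of the expected squared error
E[e] = \int_0^2 \frac{e^2}{3}\,de + \int_2^3 \frac{2e(3-e)}{3}\,de
     = \frac{8}{9} + \frac{7}{9} = \frac{5}{3}

% Median: minimizer of the expected absolute error
F(c) = \tfrac{1}{2} \;\Rightarrow\; c^2/6 = \tfrac{1}{2} \;\Rightarrow\; c = \sqrt{3} \approx 1.732

% Weighted linear loss: minimize  w_u E[(e-c)^+] + w_o E[(c-e)^+]
% First-order condition:  w_o F(c) - w_u (1 - F(c)) = 0
F(c) = \frac{w_u}{w_u + w_o}

% The mode c = 2 has F(2) = 4/6 = 2/3, so it is recovered when w_u = 2 w_o
```

Under these assumptions a two-to-one weighting of the linear loss places the estimate at the mode of this particular distribution; other weightings would pick out other quantiles.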

The  point  to  be  stressed  is  that  the  MAD  procedure  can  be  used 
either  to  reflect  the  non-quadratic  loss  functions  (figure  1)  or  to 
estimate  various  central  measures  when  the  errors  are  not  distributed 
symmetrically  (figures  2  and  3).   Table  2  summarizes  these  alternatives. 


Insert  Table  2  about  here 


However, Wilson [1978] found that when outliers are present MAD
estimates are significantly more efficient than OLS estimates.
We have demonstrated how the values of the coefficients can be obtained.
The key remaining question is how good or useful the estimates are. This
issue is addressed in the next section.

Evaluation  of  the  Results 

One obvious measure of the "goodness" of estimation is how much of
the variation in overhead costs is explained by the estimated cost
pattern. To answer this question for MAD we shall start by defining the
term total variation.

The OLS estimates minimize the sum of the squared errors. Therefore,
the total variation should also be expressed in terms of squared
deviations. Had the direct-labor hours information not been available
to us, the estimate of the overhead cost that minimizes the sum of squared
errors would be the mean of the overhead costs, giving a total variation
of \( \sum_i (y_i - y_{mean})^2 \). This measure of variation is called
the Total Sum of Squares (TSS).

However, since the objective of MAD estimation is to minimize
absolute deviations, a reasonable measure of variation would be the sum
of absolute deviations from the median, the Total Absolute Deviation
(TAD):

\[ TAD = \sum_i \left| y_i - y_{median} \right| . \]

We can obtain the corresponding measures of the variations that have
been, or have not been, accounted for by introducing direct-labor hours
as an explanatory variable. For OLS estimation it is the Sum of Squared
Errors (\( SSE = \sum e_i^2 \)), and for MAD estimation it is the Sum of
Absolute Deviations (\( SAD = \sum |e_i| \)), which is the value of the
objective function in the L.P. formulation. We can then use these
measures to obtain an index of the deviations explained by the estimated
cost pattern.
Using Problem 1 as an example, for OLS regression:

$$R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{TSS}} = 1 - \frac{\sum_i e_i^2}{\sum_i (y_i - y_{mean})^2}$$

and for MAD estimation:

$$Z = 1 - \frac{\mathrm{SAD}}{\mathrm{TAD}} = 1 - \frac{\sum_i |e_i|}{\sum_i |y_i - y_{median}|} = .6952 .$$

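A minimal sketch of how the two indices could be computed for a one-variable cost equation follows. The observations are hypothetical (they are not the Problem 1 data), and the MAD line is found here by trying every line through two observations instead of solving the L.P.; this shortcut assumes, as is standard for least-absolute-deviation fits with one regressor, that some optimal line fits two observations exactly.

```python
from itertools import combinations
from statistics import mean, median

# Hypothetical (direct labor hours, overhead cost) observations.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
costs = [3.0, 5.0, 7.0, 9.0, 11.0, 25.0]  # the last observation is an outlier


def ols_line(xs, ys):
    """Intercept and slope minimizing the sum of squared errors."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def mad_line(xs, ys):
    """Intercept and slope minimizing the sum of absolute errors.

    Assumption: an optimal line passes through two observations, so every
    pair is tried in turn (adequate for small samples only).
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line is not a cost equation
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        sad = sum(abs(y - intercept - slope * x) for x, y in zip(xs, ys))
        if best is None or sad < best[0]:
            best = (sad, intercept, slope)
    return best[1], best[2]


def residuals(line, xs, ys):
    """Errors e_i = y_i - (intercept + slope * x_i)."""
    intercept, slope = line
    return [y - intercept - slope * x for x, y in zip(xs, ys)]


ols_fit = ols_line(hours, costs)
mad_fit = mad_line(hours, costs)

# OLS index: share of squared variation around the mean that is explained.
sse = sum(e ** 2 for e in residuals(ols_fit, hours, costs))
tss = sum((y - mean(costs)) ** 2 for y in costs)
r_squared = 1 - sse / tss

# MAD index: share of absolute variation around the median that is explained.
sad = sum(abs(e) for e in residuals(mad_fit, hours, costs))
tad = sum(abs(y - median(costs)) for y in costs)
z_index = 1 - sad / tad

print(f"R^2 = {r_squared:.4f}, Z = {z_index:.4f}")
```

Each index is computed against its own baseline and its own fitted line, so the two numbers answer different questions and are not directly comparable.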
Another measure that can be used for evaluation of the cost equation
is the average error associated with the estimated costs. In OLS the
average error is labeled the "standard error of the estimate" (SE):

$$\mathrm{SE} = \sqrt{\frac{\mathrm{SSE}}{d.f.}} = \sqrt{\frac{\sum_i e_i^2}{d.f.}} = \sqrt{\frac{1.77463\mathrm{E}8}{22}} = 2840.16 ,$$

where d.f. is the degrees of freedom and 1.77463E8 = 1.77463 x 10^8.


As discussed in Appendix I, because of the estimation technique only 22
of the 24 error terms are free to take on values. Thus the sum of squared
errors is divided by 22 to obtain the average. Similarly, the mean
deviation (MD) associated with the MAD estimates (which also have 22
degrees of freedom) can be calculated as follows:

$$\mathrm{MD} = \frac{\mathrm{SAD}}{d.f.} = \frac{\sum_i |e_i|}{22} = \frac{55{,}170}{22} = 2507.7 .$$
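The two average-error measures differ only in how the errors are aggregated before dividing by the degrees of freedom. The sketch below makes that explicit; the function names and the four-element residual list are hypothetical, while the figures 55,170 and 24 are the ones quoted in the text.

```python
import math


def standard_error(residuals, n_params=2):
    """OLS-style average error: sqrt(SSE / d.f.), with d.f. = n - n_params."""
    dof = len(residuals) - n_params
    return math.sqrt(sum(e * e for e in residuals) / dof)


def mean_deviation(residuals, n_params=2):
    """MAD-style average error: SAD / d.f., with d.f. = n - n_params."""
    dof = len(residuals) - n_params
    return sum(abs(e) for e in residuals) / dof


# Hypothetical residuals: four observations, two estimated coefficients.
example = [3.0, -4.0, 0.0, 0.0]
print(standard_error(example), mean_deviation(example))

# Arithmetic from the text: SAD = 55,170 with 24 observations and an
# intercept plus one slope, hence 22 degrees of freedom.
md_paper = 55170 / (24 - 2)
print(round(md_paper, 1))
```

The default `n_params=2` reflects a cost equation with an intercept and a single slope; a multiple-regression setting would need a larger value.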

One note of caution is in order. While it can be argued that the OLS
measures ($R^2$, SE) and the MAD measures (Z, MD) possess similar
properties, these measures should not be used for the purpose of comparing
the two estimation techniques. Both methods of estimation yield optimal
solutions with respect to their given criteria. Choosing between the two
methods should be based on the manager's judgment as to which set of
criteria (OLS or MAD) is more appropriate for the situation at hand. That
is, the choice of the estimation procedure should rest on the basic
assumptions of the procedures, not on the numbers resulting from a given
problem; the latter are not suitable for comparison.

As  noted  in  the  introduction,  the  purpose  of  this  paper  has  been  to 
point  out  the  inherent  assumptions  of  OLS  regression  analysis,  and  to 
offer  an  alternative  method  of  cost  estimation  should  the  MAD  assumptions 
correspond more closely to the needs of management than do the OLS
assumptions. MAD estimation can be applied to more complex situations
(e.g., multiple regression, indicator variables, etc.), but it should be
recognized that much of the apparatus developed for the OLS technique is
not yet available
for  MAD.   For  example,  the  statistical  properties  of  the  estimates  and 
the  analysis  of  the  error  terms  and  the  coefficients  have  not  been 
developed  as  well  as  they  have  for  OLS  regression  analysis.   Should  the 
actual  application  of  MAD  in  real  situations  prove  useful,  we  can  expect 
future  studies  to  develop  the  necessary  understanding  and  techniques  for 
further  analysis. 

As  early  as  1821,  Gauss  conceded  that  the  choice  of  a  loss  function 
was  somewhat  arbitrary,  and  that  Laplace's  choice  of  absolute  error  was  no 
more  arbitrary  than  his  own  choice  of  squared  error.   However,  when  we 
cast  the  cost  estimation  problem  in  a  decision  setting,  we  have  the 
basis for choosing an appropriate loss function. In this paper, we have
shown how a linear programming model can be used for estimation of the
cost patterns when the loss function is linear.


Appendix 

This appendix provides the computational details for an example used in
this paper. The triangular distribution has the following density
function:

$$f(x) = \begin{cases} hx, & a \le x \le c \\ h'(b - x), & c \le x \le b \end{cases}$$

In the example a = 0, b = 3, and c = 2, which is the mode. Then the
height of the triangle, k, is 2/3 and the slopes h and h' are 1/3 and
2/3 respectively. We first wish to find the values of the central
tendency measures:

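Mean:

$$E(x) = \int_a^b x f(x)\,dx$$

$$= \int_a^c x[hx]\,dx + \int_c^b x[h'(b - x)]\,dx$$

$$= \int_a^c hx^2\,dx + \int_c^b [h'bx - h'x^2]\,dx$$

$$= \frac{1}{3}\left(hx^3 \Big|_a^c\right) + \frac{1}{2}\left(h'bx^2 \Big|_c^b\right) - \frac{1}{3}\left(h'x^3 \Big|_c^b\right)$$

$$= 5/3$$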


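Median:

$$P(x \le m) = \int_a^m f(x)\,dx = \int_a^m hx\,dx = .5$$

$$\frac{1}{2}\left(hx^2 \Big|_a^m\right) = \frac{1}{2}hm^2 = .5 \quad \text{(since } a = 0\text{)}$$

Since h = 1/3,

$$m^2 = 3, \qquad m = 3^{1/2} = 1.732 .$$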
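The mean and median derived above can be verified by numerical integration. The sketch below is only a check under the appendix's parameter values (a = 0, b = 3, c = 2); the midpoint-rule helper, the step count and the bisection loop are illustrative choices, not part of the original derivation.

```python
import math

A, B, C = 0.0, 3.0, 2.0   # lower limit, upper limit, mode
K = 2.0 / (B - A)         # height of the triangle (2/3)
H = K / (C - A)           # rising slope h (1/3)
H2 = K / (B - C)          # falling slope h' (2/3)


def density(x):
    """Triangular density; the rising branch is h*x because a = 0."""
    if A <= x <= C:
        return H * x
    if C < x <= B:
        return H2 * (B - x)
    return 0.0


def integrate(func, lo, hi, steps=2000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


# Mean: E(x) = integral of x * f(x) over [a, b].
mean_value = integrate(lambda x: x * density(x), A, B)

# Median: bisect on m until the probability mass below m equals 0.5.
lo, hi = A, B
for _ in range(50):
    mid = (lo + hi) / 2
    if integrate(density, A, mid) < 0.5:
        lo = mid
    else:
        hi = mid
median_value = (lo + hi) / 2

print(mean_value, 5 / 3)
print(median_value, math.sqrt(3))
```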




The estimate that minimizes the sum of weighted absolute deviations.

Minimize $L = \int f(x)g(x)\,dx$, where f(x) is the probability density
function and g(x) is the loss function.

That is, find the estimate e such that the penalty

$$L = \int_a^e f(x)g(x)\,dx + \int_e^b f(x)g(x)\,dx$$

is minimized. For an estimate e that does not exceed the mode c, with a
weight w on errors above the estimate,

$$L = \int_a^e hx(e - x)\,dx + w\int_e^c hx(x - e)\,dx + w\int_c^b h'(b - x)(x - e)\,dx .$$

After evaluating these integrals, we take the partial derivative with
respect to e and set it equal to zero to obtain the value of e:

$$\frac{\partial L}{\partial e} = \frac{1}{2}\left[(h + wh)e^2 - ha^2 - whc^2 - wh'b^2 + 2wh'bc - wh'c^2\right] = 0 .$$

For a = 0, b = 3, c = 2, h = 1/3, h' = 2/3 and w = 1, this gives
$e^2 = 3$; thus the estimator is the median.

In order to calculate the weight w which would yield the mode as the
estimator, we simply let a = 0, b = 3, c = 2, h = 1/3, h' = 2/3 and
e = c. Then w = 2, which implies that the loss function
g(x) = (e - x) for x < e and g(x) = 2(x - e) for x > e
will yield the maximum likelihood estimator.
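The two results of this derivation (w = 1 yields the median, w = 2 yields the mode) can be checked by minimizing the expected loss numerically. The sketch below is an illustration under the appendix's parameter values; the midpoint-rule integration, the ternary search and all function names are assumptions introduced here, not the paper's procedure.

```python
import math

A, B, C = 0.0, 3.0, 2.0   # lower limit, upper limit, mode
H, H2 = 1.0 / 3.0, 2.0 / 3.0  # slopes h and h' from the appendix


def density(x):
    """Triangular density of the appendix example (a = 0)."""
    if A <= x <= C:
        return H * x
    if C < x <= B:
        return H2 * (B - x)
    return 0.0


def expected_loss(e, w, steps=3000):
    """Midpoint-rule value of L = integral of f(x) * g(x) over [a, b].

    g(x) = (e - x) when x is below the estimate e,
    g(x) = w * (x - e) when x is above it.
    """
    width = (B - A) / steps
    total = 0.0
    for i in range(steps):
        x = A + (i + 0.5) * width
        error = x - e
        loss = w * error if error > 0 else -error
        total += density(x) * loss * width
    return total


def best_estimate(w, iterations=60):
    """Ternary search for the minimizer; valid because L is convex in e."""
    lo, hi = A, B
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if expected_loss(m1, w) < expected_loss(m2, w):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


e_equal = best_estimate(1.0)    # equal weights on both sides
e_double = best_estimate(2.0)   # errors above the estimate weigh double

print(e_equal, math.sqrt(3))
print(e_double, C)
```

With equal weights the minimizer falls at the median, and doubling the weight on errors above the estimate moves it to the mode, in line with the closed-form result.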

Table 2: Estimation Methods Suitable for Different Conditions

Estimation Method      Loss Function          Central Tendency*

OLS                    $e_i^2$                Mean
MAD                    $|e_i|$                Median
Weighted MAD           $w|e_i| + |e_j|$       Mode (MLE)

*Assuming non-symmetric, independent and identical error distributions.

Figure 1: Alternative Loss Functions

Figure 2: Estimated Cost Functions

Figure 3: A Non-Symmetrical Triangular Distribution
(f(x): probability density function, g(x): loss function)

Figure 4: Cost Equations Under Various Loss Functions
(f(x|y): density function of the errors)